Feature ranking for semi-supervised learning
Abstract
The data used for analysis are becoming increasingly complex along several directions: high dimensionality, number of examples, and availability of labels for the examples. This poses a variety of challenges for existing machine learning methods, related to analyzing datasets with a large number of examples that are described in a high-dimensional space, where not all examples have labels provided. For example, when investigating the toxicity of chemical compounds, there are many compounds available that can be described with information-rich high-dimensional representations, but not all of the compounds have information on their toxicity. To address these challenges, we propose methods for semi-supervised learning (SSL) of feature rankings. The feature rankings are learned in the context of classification and regression, as well as in the context of structured output prediction (multi-label classification, MLC; hierarchical multi-label classification, HMLC; and multi-target regression, MTR) tasks. This is the first work that treats the task of feature ranking uniformly across various tasks of semi-supervised structured output prediction. To the best of our knowledge, it is also the first work on SSL of feature rankings for the tasks of HMLC and MTR. More specifically, we propose two approaches, based on predictive clustering tree ensembles and on the Relief family of algorithms, and evaluate their performance across 38 benchmark datasets. The extensive evaluation reveals that rankings based on Random Forest ensembles perform the best for classification tasks (incl. MLC and HMLC tasks) and are the fastest for all tasks, while ensembles based on extremely randomized trees work best for the regression tasks. Semi-supervised feature rankings outperform their supervised counterparts across the majority of datasets for all of the different tasks, showing the benefit of using unlabeled in addition to labeled data.
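To make the idea of a Relief-style feature ranking concrete, the following is a minimal sketch of the basic supervised Relief algorithm for binary classification on numeric features. It is not the semi-supervised method proposed in the paper. The function names (`relief_ranking`, `make_toy_data`), the toy dataset, and all parameter values are assumptions introduced here for illustration only.

```python
import random


def relief_ranking(X, y, n_iterations=100, seed=0):
    """Basic supervised Relief (binary classes, numeric features).

    Illustrative sketch only; NOT the semi-supervised variant from the paper.
    Returns (weights, ranking), where ranking lists feature indices from
    most to least relevant.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])

    # Per-feature value ranges, used to scale differences into [0, 1].
    spans = []
    for j in range(d):
        column = [row[j] for row in X]
        spans.append((max(column) - min(column)) or 1.0)

    def diff(a, b, j):
        return abs(a[j] - b[j]) / spans[j]

    def distance(a, b):
        return sum(diff(a, b, j) for j in range(d))

    weights = [0.0] * d
    for _ in range(n_iterations):
        i = rng.randrange(n)
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue
        nearest_hit = min(hits, key=lambda k: distance(X[i], X[k]))
        nearest_miss = min(misses, key=lambda k: distance(X[i], X[k]))
        # A relevant feature differs on the nearest miss but not on the nearest hit.
        for j in range(d):
            weights[j] += (
                diff(X[i], X[nearest_miss], j) - diff(X[i], X[nearest_hit], j)
            ) / n_iterations

    ranking = sorted(range(d), key=lambda j: weights[j], reverse=True)
    return weights, ranking


def make_toy_data(n=40, seed=1):
    """Hypothetical toy data: feature 0 tracks the label, feature 1 is noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([label + rng.uniform(-0.2, 0.2), rng.uniform(0.0, 1.0)])
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    weights, ranking = relief_ranking(X, y)
    print("weights:", weights)
    print("ranking (best first):", ranking)
```

On this toy data the informative feature should receive the larger weight. A semi-supervised variant would additionally need some way to use the unlabeled examples, which this sketch does not attempt.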
Similar resources
Label Ranking with Semi-Supervised Learning
Label ranking is considered an efficient approach for object recognition, document classification, and recommendation tasks, and has been widely studied in recent years. It aims to learn a mapping from instances to a ranking list over a finite set of predefined labels. Traditional solutions for label ranking cannot obtain satisfactory results by only utilizing labeled data and ignore large amo...
Data Ranking in Semi-Supervised Learning
The real challenge in pattern recognition tasks and machine learning processes is to train a discriminator using labeled data and use it to distinguish between future data points as accurately as possible. However, most real-world problems involve large amounts of data, so assigning labels to every data point in these problems is cumbersome or even impossible. Semi-supervised...
Semi-supervised Ranking Pursuit
We propose a novel sparse preference learning/ranking algorithm. Our algorithm approximates the true utility function by a weighted sum of basis functions using the squared loss on pairs of data points, and is a generalization of the kernel matching pursuit method. It can operate both in a supervised and a semi-supervised setting and allows efficient search for multiple, near-optimal solutions....
Semi-Supervised Ensemble Ranking
Ranking plays a central role in many Web search and information retrieval applications. Ensemble ranking, sometimes called meta-search, aims to improve the retrieval performance by combining the outputs from multiple ranking algorithms. Many ensemble ranking approaches employ supervised learning techniques to learn appropriate weights for combining multiple rankers. The main shortcoming with th...
Multiview Semi-supervised Learning for Ranking Multilingual Documents
We address the problem of learning to rank documents in a multilingual context, when reference ranking information is only partially available. We propose a multiview learning approach to this semisupervised ranking task, where the translation of a document in a given language is considered as a view of the document. Although both multiview and semi-supervised learning of classifiers have been ...
Journal
Journal title: Machine Learning
Year: 2022
ISSN: 0885-6125, 1573-0565
DOI: https://doi.org/10.1007/s10994-022-06181-0